APPENDIX

1-23. (CANCELED) 

24. A computer-implemented system for performing data mining applications,
comprising: 

(a) a computer having one or more data storage devices connected thereto, wherein a
relational database is stored on one or more of the data storage devices; 

(b) a relational database management system, executed by the computer, for accessing the 
relational database stored on the data storage devices; and

(c) an analytic application programming interface (API) that generates a set of scalable data 
mining functions including queries for execution by the relational database management system, 
executed by the computer, for performing data mining operations directly within the database 
management system.

25. The system of claim 24 above, wherein the computer comprises a parallel processing 
computer comprised of a plurality of nodes, and each node executes one or more threads of the 
relational database management system to provide parallelism in the data mining operations. 

26. The system of claim 24, wherein the scalable data mining functions process data 
collections stored in the relational database and produce results that are stored in the relational 
database.

27. The system of claim 24, wherein the scalable data mining functions are created by 
parameterizing and instantiating the analytic API. 

28. The system of claim 24, wherein the scalable data mining functions are dynamically
generated queries comprised of combined phrases with substituting values therein based on
parameters supplied to the analytic API.
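Illustrative sketch for claim 28. The claims do not give any template text or API signature, so the function name `build_frequency_query`, its parameters, and the phrase templates here are hypothetical. The sketch only shows one way a query could be assembled from combined phrases with caller-supplied values substituted in.

```python
import re

# Hypothetical phrase templates; the claims do not specify any template text.
PHRASES = {
    "select": "SELECT {column}, COUNT(*) AS freq",
    "from": "FROM {table}",
    "group": "GROUP BY {column}",
    "order": "ORDER BY freq DESC",
}

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_frequency_query(table: str, column: str) -> str:
    """Combine the phrases into one query, substituting the parameters."""
    for name in (table, column):
        # Identifiers cannot be bound as SQL parameters, so validate them.
        if not _IDENT.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    parts = [
        PHRASES[key].format(table=table, column=column)
        for key in ("select", "from", "group", "order")
    ]
    return " ".join(parts)


query = build_frequency_query("accounts", "state")
```

The resulting string would then be submitted to the relational database management system for execution, which is the step the claim places inside the database.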

29. The system of claim 28, wherein the scalable data mining functions comprise Data 
Description functions, Data Derivation functions, Data Reduction functions, Data Reorganization 
functions, Data Sampling functions, or Data Partitioning functions.

30. The system of claim 29, wherein the Data Description functions comprise
descriptive statistical functions. 

31. The system of claim 29, wherein the Data Description functions comprise: 

(1) descriptive statistics for one or more numeric columns, wherein the statistics are 
selected from a group comprising count, minimum, maximum, mean, standard 
deviation, standard mean error, variance, coefficient of variance, skewness, kurtosis, 
uncorrected sum of squares, corrected sum of squares, and quantiles,

(2) a count of values for a column, 

(3) a calculated modality for a column, 

(4) one or more bin numeric columns of counts with overlay and statistics options, 

(5) one or more automatically sub-binned numeric columns giving additional counts and 
isolated frequently occurring individual values,

(6) a computed frequency of one or more column values, 

(7) a computed frequency of values for pairs of columns in a column list, 

(8) a Pearson Product-Moment Correlation matrix, 

(9) a Covariance matrix,

(10) a sum of squares and cross-products matrix, or 

(11) a count of overlapping column values in one or more combinations of tables.
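Illustrative sketch for item (1) of claim 31. It assumes an in-memory SQLite database standing in for the relational database management system; the function `describe_column` and the table `t` are hypothetical. A single aggregate query returns the count, minimum, maximum, mean, and uncorrected sum of squares, from which the corrected sum of squares and sample variance follow.

```python
import sqlite3


def describe_column(conn: sqlite3.Connection, table: str, column: str) -> dict:
    """Compute basic descriptive statistics for one numeric column in SQL."""
    # Table and column names are assumed to be trusted identifiers here.
    sql = (
        f"SELECT COUNT({column}), MIN({column}), MAX({column}), "
        f"AVG({column}), SUM({column} * {column}) FROM {table}"
    )
    n, lo, hi, mean, uss = conn.execute(sql).fetchone()
    css = uss - n * mean * mean   # corrected sum of squares
    variance = css / (n - 1)      # sample variance; requires n >= 2
    return {
        "count": n, "min": lo, "max": hi, "mean": mean,
        "uncorrected_ss": uss, "corrected_ss": css, "variance": variance,
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(1.0,), (2.0,), (3.0,), (4.0,)])
stats = describe_column(conn, "t", "x")
```

Only one row of aggregates leaves the database, regardless of how many rows the table holds.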

32. The system of claim 29, wherein the Data Derivation functions provide column 
derivations or transformations.

33. The system of claim 29, wherein the Data Derivation functions comprise:

(1) a derived binned numeric column wherein a new column is bin number, 

(2) a n-valued categorical column dummy-coded into "n" 0/1 values,

(3) a n-valued categorical column recoded into n or less new values, 

(4) one or more numeric columns scaled via range transformation, 

(5) one or more columns scaled to a z-score that is a number of standard deviations 
from a mean, 

(6) one or more numeric columns scaled via a sigmoidal transformation function,

(7) one or more numeric columns scaled via a base 10 logarithm function,

(8) one or more numeric columns scaled via a natural logarithm function, 

(9) one or more numeric columns scaled via an exponential function, 

(10) one or more numeric columns raised to a specified power, 

(11) one or more numeric columns derived via user defined transformation function, 

(12) one or more new columns derived by ranking one or more columns or expressions 
based on order, 

(13) one or more new columns derived with quantile 0 to n-1 based on order and n,

(14) a cumulative sum of a value expression based on a sort expression, 

(15) a moving average of a value expression based on a width and order, 

(16) a moving sum of a value expression based on a width and order, 

(17) a moving difference of a value expression based on a width and order, 

(18) a moving linear regression value derived from an expression, width, and order,

(19) a multiple account/product ownership bitmap,

(20) a product ownership bitmap over multiple time periods,

(21) one or more counts, amount, percentage means and intensities derived from a 
transaction summary, 

(22) one or more variabilities derived from transaction summary data, 

(23) one or more derived trigonometric values and their inverses, including sin, arcsin, 
cos, arccos, csc, arccsc, sec, arcsec, tan, arctan, cot, and arccot, or

(24) one or more derived hyperbolic values and their inverses, including sinh, arcsinh, 
cosh, arccosh, csch, arccsch, sech, arcsech, tanh, arctanh, coth, and arccoth. 
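Illustrative sketch for items (2) and (4) of claim 33, again assuming SQLite as a stand-in database. The helper names `dummy_code_sql` and `range_scale_sql`, the table `cust`, and the choice of scaling onto [0, 1] are assumptions; the claims do not fix a target range or a naming scheme for derived columns.

```python
import sqlite3


def dummy_code_sql(table: str, column: str, values: list) -> str:
    """Build a query that dummy-codes `column` into one 0/1 column per value."""
    # Values are inlined and reused as column suffixes, so they are assumed
    # to be trusted, identifier-like strings.
    cases = [
        f"CASE WHEN {column} = '{v}' THEN 1 ELSE 0 END AS {column}_{v}"
        for v in values
    ]
    return f"SELECT {column}, " + ", ".join(cases) + f" FROM {table}"


def range_scale_sql(table: str, column: str) -> str:
    """Build a query that rescales `column` linearly onto [0, 1]."""
    return (
        f"SELECT ({column} - lo) * 1.0 / (hi - lo) AS {column}_scaled "
        f"FROM {table}, "
        f"(SELECT MIN({column}) AS lo, MAX({column}) AS hi FROM {table}) AS s "
        f"ORDER BY {column}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust (color TEXT, amt REAL)")
conn.executemany(
    "INSERT INTO cust VALUES (?, ?)",
    [("red", 10.0), ("blue", 20.0), ("red", 30.0)],
)
dummy_rows = conn.execute(dummy_code_sql("cust", "color", ["red", "blue"])).fetchall()
scaled = [r[0] for r in conn.execute(range_scale_sql("cust", "amt"))]
```

Both derivations are expressed as ordinary SELECT expressions, so the derived columns are computed where the data resides.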

34. The system of claim 29, wherein the Data Reduction functions provide matrix 
building operations to reduce the amount of data required for analytic algorithms. 

35. The system of claim 29, wherein the Data Reduction functions comprise:

(1) build one or more data reduction matrices from a group comprising: (i) a Pearson-
Product Moment Correlations matrix; (ii) a Covariances matrix; and (iii) a Sum of
Squares and Cross Products (SSCP) matrix, 

(2) export a resultant matrix, or 

(3) restart a matrix operation.

36. The system of claim 29, wherein the Data Reorganization functions provide an
ability to reorganize data by joining or de-normalizing pre-processed results into a wide analytic data
set.

37. The system of claim 29, wherein the Data Reorganization functions comprise:

(1) create a de-normalized new table by removing one or more key columns, or

(2) join a plurality of tables or views into a combined result table. 

38. The system of claim 29, wherein the Data Sampling function provides an ability to 
construct a new table containing a randomly selected subset of the rows in an existing table or view. 

39. The system of claim 29, wherein the Data Sampling function selects one or more 
data samples of specified sizes from a table.
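Illustrative sketch for claims 38 and 39, assuming SQLite as a stand-in database. The function `create_sample` and the table names are hypothetical, and ordering by `RANDOM()` is only one simple way to draw a random subset of a specified size; the claims do not name a sampling method.

```python
import sqlite3


def create_sample(conn: sqlite3.Connection, source: str, target: str, size: int) -> None:
    """Create `target` holding `size` randomly chosen rows of `source`."""
    # Table names are assumed to be trusted identifiers.
    conn.execute(
        f"CREATE TABLE {target} AS "
        f"SELECT * FROM {source} ORDER BY RANDOM() LIMIT {int(size)}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [(i, i * 1.5) for i in range(100)]
)
create_sample(conn, "accounts", "accounts_sample", 10)
sample_ids = [r[0] for r in conn.execute("SELECT id FROM accounts_sample")]
```

The sample is materialized as a new table inside the database, so later steps can query it like any other table.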

40. The system of claim 29, wherein the Data Partitioning function provides an ability to 
construct a new table containing at least one randomly selected subset of the rows in an existing 
table or view, wherein the subsets are mutually distinct but all-inclusive subsets of data. 

41. The system of claim 29, wherein the Data Partitioning function selects one or more 
data partitions from a table using a database internal hashing technique. 
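Illustrative sketch for claims 40 and 41. The claims refer to a database internal hashing technique without specifying it, so SHA-256 from Python's standard library is substituted here purely as an assumption. The names `partition_of` and `partition_rows` are hypothetical. Hashing a row key and taking the remainder assigns each row to exactly one partition, which makes the partitions mutually distinct and all-inclusive.

```python
import hashlib


def partition_of(key, num_partitions: int) -> int:
    """Map a key to a partition number in [0, num_partitions)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_rows(rows, key_index: int, num_partitions: int):
    """Split rows into disjoint partitions that together cover every row."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[partition_of(row[key_index], num_partitions)].append(row)
    return parts


rows = [(i, f"customer-{i}") for i in range(1000)]
parts = partition_rows(rows, key_index=0, num_partitions=3)
```

Because the assignment depends only on the key, repeating the partitioning over the same table yields the same partitions.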

42. The system of claim 24, wherein results of the data mining operations are stored in 
the relational databases. 

43. The system of claim 24, wherein the relational database management system further 
comprises an analytical logical data model that stores metadata and processing results from the 
Scalable Data Mining Functions. 

44. A method for performing data mining applications, comprising: 

(a) storing a relational database on one or more data storage devices connected to a 
computer;

(b) accessing the relational database stored on the data storage devices using a relational
database management system; and 

(c) executing an analytic application programming interface (API) that generates a set of 
scalable data mining functions including queries for execution by the relational database
management system, for performing data mining operations directly within the database 
management system. 

45. An article of manufacture comprising logic embodying a method for performing data 
mining applications, comprising:

(a) storing a relational database on one or more data storage devices connected to a
computer;

(b) accessing the relational database stored on the data storage devices using a relational 
database management system; and 

(c) executing an analytic application programming interface (API) that generates a set of 
scalable data mining functions including queries for execution by the relational database
management system, for performing data mining operations directly within the database 
management system. 

46. The method of claim 44 above, wherein the computer comprises a parallel 
processing computer comprised of a plurality of nodes, and each node executes one or more threads 
of the relational database management system to provide parallelism in the data mining operations.

47. The method of claim 44, wherein the scalable data mining functions process data 
collections stored in the relational database and produce results that are stored in the relational 
database. 

48. The method of claim 44, wherein the scalable data mining functions are created by 
parameterizing and instantiating the analytic API. 

49. The method of claim 44, wherein the scalable data mining functions are dynamically 
generated queries comprised of combined phrases with substituting values therein based on 
parameters supplied to the analytic API.

50. The method of claim 49, wherein the scalable data mining functions comprise Data
Description functions, Data Derivation functions, Data Reduction functions, Data Reorganization 
functions, Data Sampling functions, or Data Partitioning functions.

51. The method of claim 50, wherein the Data Description functions comprise 
descriptive statistical functions. 

52. The method of claim 50, wherein the Data Description functions comprise: 

(1) descriptive statistics for one or more numeric columns, wherein the statistics are 
selected from a group comprising count, minimum, maximum, mean, standard 
deviation, standard mean error, variance, coefficient of variance, skewness, kurtosis, 
uncorrected sum of squares, corrected sum of squares, and quantiles, 

(2) a count of values for a column, 

(3) a calculated modality for a column, 

(4) one or more bin numeric columns of counts with overlay and statistics options, 

(5) one or more automatically sub-binned numeric columns giving additional counts and 
isolated frequently occurring individual values,

(6) a computed frequency of one or more column values, 

(7) a computed frequency of values for pairs of columns in a column list, 

(8) a Pearson Product-Moment Correlation matrix, 

(9) a Covariance matrix, 

(10) a sum of squares and cross-products matrix, or 

(11) a count of overlapping column values in one or more combinations of tables.

53. The method of claim 50, wherein the Data Derivation functions provide column 
derivations or transformations. 

54. The method of claim 50, wherein the Data Derivation functions comprise: 

(1) a derived binned numeric column wherein a new column is bin number, 

(2) a n-valued categorical column dummy-coded into "n" 0/1 values,

(3) a n-valued categorical column recoded into n or less new values,

(4) one or more numeric columns scaled via range transformation,

(5) one or more columns scaled to a z-score that is a number of standard deviations 
from a mean, 

(6) one or more numeric columns scaled via a sigmoidal transformation function,

(7) one or more numeric columns scaled via a base 10 logarithm function, 

(8) one or more numeric columns scaled via a natural logarithm function, 

(9) one or more numeric columns scaled via an exponential function, 

(10) one or more numeric columns raised to a specified power, 

(11) one or more numeric columns derived via user defined transformation function, 

(12) one or more new columns derived by ranking one or more columns or expressions 
based on order, 

(13) one or more new columns derived with quantile 0 to n-1 based on order and n,

(14) a cumulative sum of a value expression based on a sort expression, 

(15) a moving average of a value expression based on a width and order, 

(16) a moving sum of a value expression based on a width and order,

(17) a moving difference of a value expression based on a width and order, 

(18) a moving linear regression value derived from an expression, width, and order, 

(19) a multiple account/product ownership bitmap, 

(20) a product ownership bitmap over multiple time periods, 

(21) one or more counts, amount, percentage means and intensities derived from a 
transaction summary, 

(22) one or more variabilities derived from transaction summary data, 

(23) one or more derived trigonometric values and their inverses, including sin, arcsin, 
cos, arccos, csc, arccsc, sec, arcsec, tan, arctan, cot, and arccot, or

(24) one or more derived hyperbolic values and their inverses, including sinh, arcsinh, 
cosh, arccosh, csch, arccsch, sech, arcsech, tanh, arctanh, coth, and arccoth. 
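Illustrative sketch for items (14), (15), and (17) of claim 54. The claims do not define the window precisely, so this sketch assumes a trailing window over values already sorted by the sort expression, and a moving difference taken against the value `width` positions earlier. The function names are hypothetical, and the code is a plain Python reference for the arithmetic rather than an in-database implementation.

```python
from itertools import accumulate


def cumulative_sum(values):
    """Running total of the values in their given order."""
    return list(accumulate(values))


def moving_average(values, width):
    """Average of each value and up to width - 1 preceding values."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def moving_difference(values, width):
    """values[i] - values[i - width]; None where no such predecessor exists."""
    return [
        values[i] - values[i - width] if i >= width else None
        for i in range(len(values))
    ]


running = cumulative_sum([1, 2, 3, 4])
averages = moving_average([2, 4, 6, 8], width=2)
differences = moving_difference([1, 4, 9, 16], width=1)
```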

55. The method of claim 50, wherein the Data Reduction functions provide matrix 
building operations to reduce the amount of data required for analytic algorithms. 

56. The method of claim 50, wherein the Data Reduction functions comprise:

(1) build one or more data reduction matrices from a group comprising: (i) a Pearson-
Product Moment Correlations matrix; (ii) a Covariances matrix; and (iii) a Sum of
Squares and Cross Products (SSCP) matrix, 

(2) export a resultant matrix, or

(3) restart a matrix operation. 
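Illustrative sketch for claims 55 and 56. It shows why the three matrices reduce the data an analytic algorithm needs: once the SSCP matrix, the column sums, and the row count are accumulated in one pass, the covariance and Pearson correlation matrices follow without revisiting the rows. The function names are hypothetical, and the code is a plain Python reference rather than the claimed in-database operation.

```python
import math


def sscp(rows):
    """Sum of squares and cross-products matrix, column sums, and row count."""
    k = len(rows[0])
    m = [[0.0] * k for _ in range(k)]
    sums = [0.0] * k
    for row in rows:
        for i in range(k):
            sums[i] += row[i]
            for j in range(k):
                m[i][j] += row[i] * row[j]
    return m, sums, len(rows)


def covariance(m, sums, n):
    """Sample covariance matrix derived from the SSCP matrix alone."""
    k = len(sums)
    return [
        [(m[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def correlation(cov):
    """Pearson correlation matrix derived from the covariance matrix."""
    k = len(cov)
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(k)]
        for i in range(k)
    ]


m, sums, n = sscp([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
cov = covariance(m, sums, n)
corr = correlation(cov)
```

For k columns the matrices hold k * k numbers, however many rows were scanned to build them.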

57. The method of claim 50, wherein the Data Reorganization functions provide an
ability to reorganize data by joining or de-normalizing pre-processed results into a wide analytic data
set. 

58. The method of claim 50, wherein the Data Reorganization functions comprise:

(1) create a de-normalized new table by removing one or more key columns, or 

(2) join a plurality of tables or views into a combined result table. 

59. The method of claim 50, wherein the Data Sampling function provides an ability to
construct a new table containing a randomly selected subset of the rows in an existing table or view. 

60. The method of claim 50, wherein the Data Sampling function selects one or more 
data samples of specified sizes from a table. 

61. The method of claim 50, wherein the Data Partitioning function provides an ability
to construct a new table containing at least one randomly selected subset of the rows in an existing 
table or view, wherein the subsets are mutually distinct but all-inclusive subsets of data. 

62. The method of claim 50, wherein the Data Partitioning function selects one or more 
data partitions from a table using a database internal hashing technique. 

63. The method of claim 44, wherein results of the data mining operations are stored in 
the relational databases.

64. The method of claim 44, wherein the relational database management system further
comprises an analytical logical data model that stores metadata and processing results from the 
Scalable Data Mining Functions. 

65. The article of claim 45 above, wherein the computer comprises a parallel processing 
computer comprised of a plurality of nodes, and each node executes one or more threads of the 
relational database management system to provide parallelism in the data mining operations.

66. The article of claim 45, wherein the scalable data mining functions process data 
collections stored in the relational database and produce results that are stored in the relational
database.

67. The article of claim 45, wherein the scalable data mining functions are created by 
parameterizing and instantiating the analytic API. 

68. The article of claim 45, wherein the scalable data mining functions are dynamically
generated queries comprised of combined phrases with substituting values therein based on 
parameters supplied to the analytic API. 

69. The article of claim 68, wherein the scalable data mining functions comprise Data
Description functions, Data Derivation functions, Data Reduction functions, Data Reorganization
functions, Data Sampling functions, or Data Partitioning functions.

70. The article of claim 69, wherein the Data Description functions comprise descriptive 
statistical functions. 

71. The article of claim 69, wherein the Data Description functions comprise: 

(1) descriptive statistics for one or more numeric columns, wherein the statistics are 
selected from a group comprising count, minimum, maximum, mean, standard 
deviation, standard mean error, variance, coefficient of variance, skewness, kurtosis, 
uncorrected sum of squares, corrected sum of squares, and quantiles,

(2) a count of values for a column, 

PAGE 36/39 * RCVD AT 1/12/2005 6:57:10 PM [Eastern Standard Time] * SVR:USPTO-EFXRF-1/2 * DNIS:8729306 * CSID:+1 310 641 8798 * DURATION (mm-ss):10-26
01-12-2005 04:18PM FROM-Gates & Cooper LLP +13106418798 T-895 P.037/039 F-062

(3) a calculated modality for a column, 

(4) one or more bin numeric columns of counts with overlay and statistics options, 

(5) one or more automatically sub-binned numeric columns giving additional counts and 
isolated frequently occurring individual values,

(6) a computed frequency of one or more column values, 

(7) a computed frequency of values for pairs of columns in a column list, 

(8) a Pearson Product-Moment Correlation matrix,

(9) a Covariance matrix, 

(10) a sum of squares and cross-products matrix, or 

(11) a count of overlapping column values in one or more combinations of tables. 

72. The article of claim 69, wherein the Data Derivation functions provide column 
derivations or transformations. 

73. The article of claim 69, wherein the Data Derivation functions comprise: 

(1) a derived binned numeric column wherein a new column is bin number, 

(2) a n-valued categorical column dummy-coded into "n" 0/1 values,

(3) a n-valued categorical column recoded into n or less new values,

(4) one or more numeric columns scaled via range transformation, 

(5) one or more columns scaled to a z-score that is a number of standard deviations 
from a mean, 

(6) one or more numeric columns scaled via a sigmoidal transformation function, 

(7) one or more numeric columns scaled via a base 10 logarithm function, 

(8) one or more numeric columns scaled via a natural logarithm function, 

(9) one or more numeric columns scaled via an exponential function, 

(10) one or more numeric columns raised to a specified power, 

(11) one or more numeric columns derived via user defined transformation function,

(12) one or more new columns derived by ranking one or more columns or expressions 
based on order, 

(13) one or more new columns derived with quantile 0 to n-1 based on order and n,

(14) a cumulative sum of a value expression based on a sort expression, 

(15) a moving average of a value expression based on a width and order,

(16) a moving sum of a value expression based on a width and order,

(17) a moving difference of a value expression based on a width and order, 

(18) a moving linear regression value derived from an expression, width, and order, 

(19) a multiple account/product ownership bitmap, 

(20) a product ownership bitmap over multiple time periods, 

(21) one or more counts, amount, percentage means and intensities derived from a 
transaction summary, 

(22) one or more variabilities derived from transaction summary data, 

(23) one or more derived trigonometric values and their inverses, including sin, arcsin, 
cos, arccos, csc, arccsc, sec, arcsec, tan, arctan, cot, and arccot, or

(24) one or more derived hyperbolic values and their inverses, including sinh, arcsinh, 
cosh, arccosh, csch, arccsch, sech, arcsech, tanh, arctanh, coth, and arccoth. 

74. The article of claim 69, wherein the Data Reduction functions provide matrix 
building operations to reduce the amount of data required for analytic algorithms. 

75. The article of claim 69, wherein the Data Reduction functions comprise: 

(1) build one or more data reduction matrices from a group comprising: (i) a Pearson-
Product Moment Correlations matrix; (ii) a Covariances matrix; and (iii) a Sum of
Squares and Cross Products (SSCP) matrix, 

(2) export a resultant matrix, or

(3) restart a matrix operation. 

76. The article of claim 69, wherein the Data Reorganization functions provide an ability 
to reorganize data by joining or de-normalizing pre-processed results into a wide analytic data set.

77. The article of claim 69, wherein the Data Reorganization functions comprise:

(1) create a de-normalized new table by removing one or more key columns, or 

(2) join a plurality of tables or views into a combined result table. 
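Illustrative sketch for item (2) of claim 77, assuming SQLite as a stand-in database. The tables `customer` and `txn_summary`, their columns, and the result table `analytic_set` are hypothetical. A per-customer table and a pre-processed transaction summary are joined into one combined result table with a single wide row per customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE txn_summary (cust_id INTEGER, txn_count INTEGER, txn_total REAL);
    INSERT INTO customer VALUES (1, 'east'), (2, 'west');
    INSERT INTO txn_summary VALUES (1, 3, 75.0), (2, 1, 20.0);

    -- Join the pre-processed results into one wide analytic table.
    CREATE TABLE analytic_set AS
    SELECT c.cust_id, c.region, s.txn_count, s.txn_total
    FROM customer AS c
    JOIN txn_summary AS s ON s.cust_id = c.cust_id;
    """
)
wide_rows = conn.execute(
    "SELECT * FROM analytic_set ORDER BY cust_id"
).fetchall()
```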

78. The article of claim 69, wherein the Data Sampling function provides an ability to 
construct a new table containing a randomly selected subset of the rows in an existing table or view.

79. The article of claim 69, wherein the Data Sampling function selects one or more data
samples of specified sizes from a table. 

80. The article of claim 69, wherein the Data Partitioning function provides an ability to 
construct a new table containing at least one randomly selected subset of the rows in an existing 
table or view, wherein the subsets are mutually distinct but all-inclusive subsets of data. 

81. The article of claim 69, wherein the Data Partitioning function selects one or more 
data partitions from a table using a database internal hashing technique. 

82. The article of claim 45, wherein results of the data mining operations are stored in 
the relational databases. 

83. The article of claim 45, wherein the relational database management system further 
comprises an analytical logical data model that stores metadata and processing results from the 
Scalable Data Mining Functions.